Rectangular Full Packed Format for LAPACK Algorithms Timings on Several Computers

نویسندگان

  • Fred G. Gustavson
  • Jerzy Wasniewski
چکیده

We describe a new data format for storing triangular and symmetric matrices called RFP (Rectangular Full Packed). The standard two dimensional arrays of Fortran and C (also known as full format) that are used to store triangular and symmetric matrices waste nearly half the storage space but provide high performance via the use of level 3 BLAS. Standard packed format arrays fully utilize storage (array space) but provide low performance as there are no level 3 packed BLAS. We combine the good features of packed and full storage using RFP format to obtain high performance using L3 (level 3) BLAS as RFP is full format. Also, RFP format requires exactly the same minimal storage as packed format. Each full and/or packed symmetric/triangular routine becomes a single new RFP routine. We present LAPACK routines for Cholesky factorization, inverse and solution computation in RFP format to illustrate this new work and to describe its performance on the IBM, Itanium, NEC, and SUN platforms. Performance of RFP versus LAPACK full routines for both serial and SMP parallel processing is about the same while using half the storage. Performance is roughly one to a factor of 33 for serial and one to a factor of 100 for SMP parallel times faster than LAPACK packed routines. Existing LAPACK routines and vendor LAPACK routines were used in the serial and the SMP parallel study, respectively. In both studies vendor L3 BLAS were used.

برای دانلود رایگان متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

LAPACK Cholesky Routines in Rectangular Full Packed Format

We describe a new data format for storing triangular and symmetric matrices called RFP (Rectangular Full Packed). The standard two dimensional arrays of Fortran and C (also known as full format) that are used to store triangular and symmetric matrices waste half the storage space but provide high performance via the use of level 3 BLAS. Packed format arrays fully utilize storage (array space) b...

متن کامل

Three Algorithms for Cholesky Factorization on Distributed Memory Using Packed Storage

We present three algorithms for Cholesky factorization using minimum block storage for a distributed memory (DM) environment. One of the distributed square blocked packed (SBP) format algorithms performs similar to ScaLAPACK PDPOTRF, and with iteration overlapping outperforms it by as much as 67%. By storing the blocks in a standard contiguous way, we get better performing BLAS operations. Our ...

متن کامل

New Generalized Data Structures for Matrices Lead to a Variety of High Performance Algorithms

We describe new data structures for full storage of general matrices that generalize the current storage layouts of the Fortran and C programming languages. We also describe new data structures for full and packed storage of dense symmetric/triangular arrays that generalize both full and packed storage. Using the new data structures, one is led to several new algorithms that save “half” the sto...

متن کامل

Optimizing Locality of Reference in Cholesky Algorithms1

This paper presents the principle ideas involved in hierarchical blocking, introduces the block packed storage scheme, and gives the implementation details and the performance rates of the hierarchically blocked Cholesky factorization. In some cases the newly developed routines are faster by an order of magnitude than the corresponding Lapack routines. Introduction Most current computers based ...

متن کامل

A distributed packed storage for large dense parallel in-core calculations

We propose in this paper a distributed packed storage format that exploits the symmetry or the triangular structure of a dense matrix. This format stores only half of the matrix while maintaining most of the efficiency compared to a full storage for a wide range of operations. This work has been motivated by the fact that, contrary to sequential linear algebra libraries (e.g. LAPACK [4]), there...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

عنوان ژورنال:

دوره   شماره 

صفحات  -

تاریخ انتشار 2006